Skip to content

Java: Recognise @Pattern annotation as sanitizer for log injection#21326

Merged
owen-mc merged 8 commits intogithub:mainfrom
owen-mc:java/log-injection-regex-match
Feb 16, 2026
Merged

Java: Recognise @Pattern annotation as sanitizer for log injection#21326
owen-mc merged 8 commits intogithub:mainfrom
owen-mc:java/log-injection-regex-match

Conversation

@owen-mc
Copy link
Contributor

@owen-mc owen-mc commented Feb 14, 2026

This builds on #21310, so you can ignore the first set of commits.

DCA seems to show a performance regression on three repos, though I've tried two of them locally and one was a performance improvement and the other was neutral.

@owen-mc owen-mc force-pushed the java/log-injection-regex-match branch from da2c3be to 19b9d2f Compare February 14, 2026 16:01
@owen-mc owen-mc marked this pull request as ready for review February 15, 2026 00:07
@owen-mc owen-mc requested a review from a team as a code owner February 15, 2026 00:07
Copilot AI review requested due to automatic review settings February 15, 2026 00:07
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request builds upon PR #21310 to recognize the @javax.validation.constraints.Pattern annotation as a sanitizer for log injection and other security queries. The PR introduces a new RegexMatch concept in Java CodeQL and refactors existing regex matching code to use this abstraction.

Changes:

  • Introduces a new RegexMatch concept class to model regular expression matching operations
  • Implements PatternAnnotation to recognize @javax.validation.constraints.Pattern annotations as sanitizers
  • Refactors existing regex match detection code to use the new concept
  • Updates sanitizers for log injection, SSRF, path injection, and regex injection queries to recognize more patterns

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated no comments.

Show a summary per file
File Description
java/ql/lib/semmle/code/java/Concepts.qll New file defining the RegexMatch concept for modeling regex matching operations
java/ql/lib/semmle/code/java/JDK.qll Updated StringMatchesCall to implement RegexMatch::Range
java/ql/lib/semmle/code/java/frameworks/Regex.qll Added PatternMatchesCall, MatcherMatchesCall, and related classes implementing RegexMatch::Range
java/ql/lib/semmle/code/java/frameworks/JavaxAnnotations.qll Added PatternAnnotation class to model @Pattern annotations as regex matches
java/ql/lib/semmle/code/java/security/Sanitizers.qll Refactored to use RegexMatch concept; added special handling for annotations as barriers
java/ql/lib/semmle/code/java/security/LogInjection.qll Extended to recognize @Pattern annotations as sanitizers and refactored guard logic
java/ql/lib/semmle/code/java/security/PathSanitizer.qll Updated to use RegexMatch concept instead of specific method calls
java/ql/lib/semmle/code/java/security/regexp/RegexInjection.qll Simplified to use PatternCompileCall class
java/ql/src/experimental/Security/CWE/CWE-625/Regex.qll Deprecated module removed
java/ql/src/experimental/Security/CWE/CWE-625/PermissiveDotRegexQuery.qll Updated to use new regex call classes instead of deprecated module
java/ql/test/stubs/javax-validation-constraints/* Added stub files for javax.validation.constraints package
java/ql/test/query-tests/security/CWE-918/* Added comprehensive tests for SSRF sanitization with @Pattern annotations
java/ql/test/query-tests/security/CWE-117/* Added tests for log injection sanitization with @Pattern annotations
java/ql/lib/change-notes/* Added appropriate change notes documenting the new functionality

@owen-mc
Copy link
Contributor Author

owen-mc commented Feb 15, 2026

Something weird is going on with DCA. (Note that it's only running the log injection query.) It shows a big increase in analysis time for apache/hive, gradle/gradle and apache/flink. But when I download the db for apache/hive or apache/flink and run it locally I see a significant improvement in performance*. And for gradle/gradle it seems the same, or maybe a slight improvement. Maybe it's to do with DCA runners not having as much memory?

* because pulling out the predicate regexPreventsLogInjection with a bindingset on regex forces us to only get the string value of the relevant compile time constant exprs, instead of getting the string value of all compile time constant exprs, seeing which match the given regexes, and then narrowing down to exprs that appear in the right places.

@owen-mc owen-mc force-pushed the java/log-injection-regex-match branch from 99cd7a3 to 34acc84 Compare February 15, 2026 14:53
@aschackmull
Copy link
Contributor

Something weird is going on with DCA.

I tried to reproduce, and initially got the same "weird" result as you in the sense that I couldn't reproduce any regression - yet the dca results look consistent. Then I checked the branches of the dca run and noticed a difference: The dca tests an earlier version of your branch that looks like it has a substantially different diff - it tests da2c3be instead of the current PR head 34acc84. So try to rerun the dca with the current PR head and then let's see if the regression goes away.

@owen-mc
Copy link
Contributor Author

owen-mc commented Feb 16, 2026

@aschackmull Can you review this PR, so that if DCA is good then I can merge straight away and maybe get it in before the release branches are cut? Reminder: only the last 7 commits are new, starting at "Add failing log injection test for @pattern validation".

@owen-mc
Copy link
Contributor Author

owen-mc commented Feb 16, 2026

The latest DCA run looks better. apache/flink now shows a significant reduction in analysis time. There is an odd result for apache__commons-logging@7c70d1a1___buildless, though with the high variance for v1 I think it is most likely that this is a network issue. Indeed, now I look at the 3 runs I see that one of them has a very high value for v1.

@aschackmull
Copy link
Contributor

The test currently includes Fixed spurious result: Alert as expected test failures - could you remove the SPURIOUS annotations in the java test file to get rid of those? And then we might as well rebase to clear the PR diff of #21310.

@owen-mc owen-mc force-pushed the java/log-injection-regex-match branch from 34acc84 to cf73d96 Compare February 16, 2026 12:03
@owen-mc
Copy link
Contributor Author

owen-mc commented Feb 16, 2026

@aschackmull I have updated the tests, rebased and force-pushed.

Copy link
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@owen-mc owen-mc merged commit 7742a56 into github:main Feb 16, 2026
19 of 23 checks passed
@owen-mc owen-mc deleted the java/log-injection-regex-match branch February 16, 2026 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants